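Heterogeneous graph learning has drawn significant attention in recent years, owing to the success of graph neural networks (GNNs) and the broad application of heterogeneous information networks. Various heterogeneous graph neural networks have been proposed to generalize GNNs to heterogeneous graphs. Unfortunately, these approaches model heterogeneity through various complicated modules. This paper aims to propose a simple yet effective framework that gives homogeneous GNNs adequate ability to handle heterogeneous graphs. Specifically, we propose relation embedding based graph neural networks (RE-GNNs), which use only one parameter to embed the importance of edge-type relations and self-loop connections. To optimize these relation embeddings and the other parameters simultaneously, a gradient scaling factor is proposed to constrain the embeddings to converge to suitable values. In addition, we theoretically demonstrate that our RE-GNNs have more expressive power than meta-path based heterogeneous GNNs. Extensive experiments on node classification tasks validate the effectiveness of the proposed method.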
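To make the idea of scalar relation embeddings concrete, the following is a minimal sketch and not the RE-GNN implementation. It assumes, purely for illustration, one positive scalar weight per edge type plus one scalar for self-loops, and it uses plain mean aggregation in place of a full GNN layer. The names `relation_weighted_adjacency` and `mean_aggregate` are hypothetical, and the gradient scaling factor mentioned in the abstract is omitted.

```python
import numpy as np


def relation_weighted_adjacency(num_nodes, edges, relation_weights, self_loop_weight):
    """Collapse a typed edge list into one weighted adjacency matrix.

    edges: iterable of (src, dst, relation_id) triples.
    relation_weights: one scalar per relation type (assumed positive here).
    self_loop_weight: scalar importance of each node's own features.
    """
    adjacency = np.zeros((num_nodes, num_nodes), dtype=float)
    for src, dst, relation in edges:
        # Row = receiving node, column = sending node.
        adjacency[dst, src] += relation_weights[relation]
    adjacency += self_loop_weight * np.eye(num_nodes)
    return adjacency


def mean_aggregate(adjacency, features):
    """Weighted mean of neighbour features, as a homogeneous GNN layer might do."""
    row_sums = adjacency.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0  # isolated nodes keep a zero aggregate
    return (adjacency / row_sums) @ features


# Toy graph: node 1 receives one edge of type 0 and one edge of type 1.
edges = [(0, 1, 0), (2, 1, 1)]
relation_weights = np.array([1.0, 3.0])
features = np.array([[1.0], [0.0], [5.0]])
adjacency = relation_weighted_adjacency(3, edges, relation_weights, self_loop_weight=0.0)
aggregated = mean_aggregate(adjacency, features)
```

In this toy setting the type-1 edge counts three times as much as the type-0 edge when node 1 aggregates its neighbours, which is the kind of per-relation importance a learned scalar could express.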
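Working memory (WM), which denotes the information temporarily stored in the mind, is a fundamental research topic in the field of human cognition. Electroencephalography (EEG), which can monitor the electrical activity of the brain, has been widely used to measure the level of WM. However, one of the critical challenges is that individual differences may cause ineffective results, especially when an established model meets an unfamiliar subject. In this work, we propose a cross-subject deep adaptation model with spatial attention (CS-DASA) to generalize workload classification across subjects. First, we transform EEG time series into multi-frame EEG images that incorporate spatial, spectral, and temporal information. Next, the subject-shared module in CS-DASA receives multi-frame EEG image data from both source and target subjects and learns common feature representations. Then, in the subject-specific module, the maximum mean discrepancy is used to measure the domain distribution divergence in a reproducing kernel Hilbert space, which adds an effective penalty loss for domain adaptation. Additionally, a subject-to-subject spatial attention mechanism is employed to focus on the discriminative spatial features of the target image data. Experiments conducted on a public WM EEG dataset containing 13 subjects show that the proposed model achieves better performance than existing state-of-the-art methods.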
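The maximum mean discrepancy mentioned in this abstract can be sketched as follows. This is a generic, biased estimate of the squared MMD with a Gaussian (RBF) kernel, not the CS-DASA code; the kernel choice, the bandwidth `sigma`, and the function names are assumptions made for illustration.

```python
import numpy as np


def rbf_kernel(x, y, sigma=1.0):
    """Gaussian kernel matrix between the rows of x and the rows of y."""
    sq_dist = (
        np.sum(x ** 2, axis=1)[:, None]
        + np.sum(y ** 2, axis=1)[None, :]
        - 2.0 * x @ y.T
    )
    # Clip tiny negative values caused by floating-point rounding.
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * sigma ** 2))


def mmd_squared(source, target, sigma=1.0):
    """Biased estimate of squared MMD between two feature batches."""
    k_ss = rbf_kernel(source, source, sigma).mean()
    k_tt = rbf_kernel(target, target, sigma).mean()
    k_st = rbf_kernel(source, target, sigma).mean()
    return k_ss + k_tt - 2.0 * k_st


rng = np.random.default_rng(0)
source_features = rng.normal(size=(50, 4))
shifted_target = source_features + 3.0  # a deliberately mismatched "subject"
penalty = mmd_squared(source_features, shifted_target)
```

Added to a classification loss, a term like `penalty` grows when source and target feature distributions diverge, which is how it can act as a domain adaptation penalty.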
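The task of link prediction aims to solve the problem of incomplete knowledge caused by the difficulty of collecting facts from the real world. GCN-based models are widely applied to link prediction because of their sophistication, but they suffer from two problems in their structure and training process: 1) the transformation methods of GCN layers become increasingly complex in GCN-based knowledge representation models; 2) because the knowledge graph collection process is incomplete, many uncollected true facts are labeled as negative samples. Therefore, this paper investigates the characteristics of the information aggregation coefficients (self-attention) of adjacent nodes and redesigns the self-attention mechanism of the GAT structure. Meanwhile, inspired by human thinking habits, we design a semi-supervised self-training method over pre-trained models. Experimental results on the benchmark datasets FB15K-237 and WN18RR show that the proposed self-attention mechanism and semi-supervised self-training method effectively improve the performance of the link prediction task. On FB15K-237, for example, the proposed method improves Hits@1 by about 30%.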
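For readers unfamiliar with the Hits@1 metric quoted above, here is a minimal sketch of how Hits@k is commonly computed from the rank of each true entity among the scored candidates. The function name and the toy ranks are illustrative assumptions, not values from the paper.

```python
def hits_at_k(ranks, k):
    """Fraction of test triples whose true entity is ranked within the top k.

    ranks: 1-based rank of the correct entity for each test triple.
    """
    if not ranks:
        raise ValueError("ranks must not be empty")
    return sum(1 for rank in ranks if rank <= k) / len(ranks)


# Toy ranks for five test triples (hypothetical values).
example_ranks = [1, 3, 2, 10, 1]
hits_1 = hits_at_k(example_ranks, 1)
hits_3 = hits_at_k(example_ranks, 3)
```

Hits@1 is the strictest variant: a prediction only counts when the correct entity is ranked first.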
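The variability of wind power supply can pose substantial challenges to incorporating wind power into a grid system. Wind power forecasting (WPF) has therefore been widely recognized as one of the most critical issues in wind power integration and operation. Research on wind power forecasting has grown explosively over the past decades. Nevertheless, handling the WPF problem well remains challenging, since high prediction accuracy is always required to ensure grid stability and security of supply. We present a unique Spatial Dynamic Wind Power Forecasting dataset, SDWPF, which includes the spatial distribution of wind turbines as well as dynamic context factors. Most existing datasets contain only a small number of wind turbines and provide neither the turbines' locations nor their context information at a fine-grained time scale. By contrast, SDWPF provides more than half a year of wind power data from wind turbines, together with their relative positions and internal statuses. We use this dataset to launch the Baidu KDD Cup 2022 to examine the limits of current WPF solutions. The dataset is released at https://aistudio.baidu.com/aistudio/competition/detail/152/0/datasets.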
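In this paper, we propose a novel framework for music-driven dance motion synthesis with controllable key pose constraints. In contrast to methods that generate dance motion sequences based only on music, this work aims to synthesize high-quality dance motion driven by music as well as by customized poses performed by users. Our model involves two single-modal transformer encoders for music and motion representations, and a cross-modal transformer decoder for dance motion generation. The cross-modal transformer decoder gains the ability to synthesize smooth dance motion sequences by introducing a local neighbor position embedding. This mechanism makes the decoder more sensitive to key poses and their corresponding positions. Extensive experiments show that our dance synthesis model achieves satisfactory performance in both quantitative and qualitative evaluations, demonstrating the effectiveness of the proposed method.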
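Although audio-visual representations have proven applicable to many downstream tasks, the representation of dance videos, which is more specific and always accompanied by music with complex auditory content, remains challenging and uninvestigated. Considering the intrinsic alignment between the rhythmic movement of dancers and the rhythm of music, we introduce MuDaR, a novel music-dance representation learning framework that synchronizes music and dance rhythms in both explicit and implicit ways. Specifically, we derive dance rhythms from visual appearance and motion cues, inspired by music rhythm analysis. The visual rhythms are then temporally aligned with their music counterparts, which are extracted from the amplitude of sound intensity. Meanwhile, we exploit the implicit coherence of rhythms in the audio and visual streams through contrastive learning. The model learns a joint embedding by predicting the temporal consistency between audio-visual pairs. The music-dance representation, together with the ability to detect audio and visual rhythms, can further be applied to three downstream tasks: (a) dance classification, (b) music-dance retrieval, and (c) music-dance retargeting. Extensive experiments demonstrate that our proposed framework outperforms other self-supervised methods by a large margin.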
In this paper, we propose a robust 3D detector, named Cross Modal Transformer (CMT), for end-to-end 3D multi-modal detection. Without explicit view transformation, CMT takes image and point cloud tokens as inputs and directly outputs accurate 3D bounding boxes. The spatial alignment of multi-modal tokens is performed implicitly, by encoding the 3D points into multi-modal features. The core design of CMT is quite simple, yet its performance is impressive. CMT obtains 73.0% NDS on the nuScenes benchmark. Moreover, CMT remains strongly robust even if the LiDAR is missing. Code will be released at https://github.com/junjie18/CMT.
Knowledge graphs (KGs) have served as a key component of various natural language processing applications. Commonsense knowledge graphs (CKGs) are a special type of KG in which entities and relations are composed of free-form text. However, previous work on KG completion and CKG completion suffers from long-tail relations and newly added relations that do not have many known triples for training. In light of this, few-shot KG completion (FKGC), which combines the strengths of graph representation learning and few-shot learning, has been proposed to address the problem of limited annotated data. In this paper, we comprehensively survey previous attempts at such tasks in the form of a series of methods and applications. Specifically, we first introduce FKGC challenges, commonly used KGs, and CKGs. Then we systematically categorize and summarize existing works in terms of the type of KG and the methods. Finally, we present applications of FKGC models to prediction tasks in different areas and share our thoughts on future research directions for FKGC.
Few Shot Instance Segmentation (FSIS) requires models to detect and segment novel classes from only a few support examples. In this work, we explore a simple yet unified solution for FSIS as well as its incremental variants, and introduce a new framework named Reference Twice (RefT) to fully explore the relationship between support and query features based on a Transformer-like framework. Our key insights are twofold. First, with the aid of support masks, we can generate dynamic class centers more appropriately to re-weight query features. Second, we find that support object queries have already encoded key factors after base training. In this way, the query features can be enhanced twice, at the feature level and at the instance level. In particular, we first design a mask-based dynamic weighting module to enhance support features and then propose to link object queries for better calibration via cross-attention. After these steps, performance on novel classes improves significantly over our strong baseline. Additionally, our new framework can be easily extended to incremental FSIS with minor modification. When benchmarked on the COCO dataset under the FSIS, gFSIS, and iFSIS settings, our method achieves competitive performance compared with existing approaches across different shots; e.g., we boost nAP by +8.2/+9.4 over the current state-of-the-art FSIS method for 10/30-shot. We further demonstrate the superiority of our approach on Few Shot Object Detection. Code and model will be available.
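One common way to turn a support mask into a class center is masked average pooling, followed by a channel-wise gate on the query features. The sketch below illustrates that general idea only; it is an assumption about how mask-based re-weighting could look, not the RefT module itself, and the function names and the sigmoid gate are hypothetical choices.

```python
import numpy as np


def masked_class_center(support_features, support_mask):
    """Average the support features over the pixels covered by the mask.

    support_features: array of shape (channels, height, width).
    support_mask: binary array of shape (height, width).
    """
    mask = support_mask.astype(float)
    area = mask.sum()
    if area == 0:
        # An empty mask gives no evidence; fall back to a zero center.
        return np.zeros(support_features.shape[0])
    return (support_features * mask).sum(axis=(1, 2)) / area


def reweight_query(query_features, class_center):
    """Scale each query channel by a sigmoid gate derived from the class center."""
    gate = 1.0 / (1.0 + np.exp(-class_center))
    return query_features * gate[:, None, None]


support = np.arange(8, dtype=float).reshape(2, 2, 2)
mask = np.array([[0, 1], [0, 0]])
center = masked_class_center(support, mask)
reweighted = reweight_query(np.ones((2, 2, 2)), center)
```

Because the center depends on the mask of each support example, the resulting gate changes with the support set, which is what makes the weighting dynamic.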
Graph Neural Networks (GNNs) have shown satisfying performance on various graph learning tasks. To achieve better fitting capability, most GNNs have a large number of parameters, which makes them computationally expensive. It is therefore difficult to deploy them on edge devices with scarce computational resources, e.g., mobile phones and wearable smart devices. Knowledge Distillation (KD) is a common solution for compressing GNNs, in which a lightweight model (i.e., the student model) is encouraged to mimic the behavior of a computationally expensive GNN (i.e., the teacher GNN model). Nevertheless, most existing GNN-based KD methods lack fairness considerations. As a consequence, the student model usually inherits and even exaggerates the bias of the teacher GNN. To handle this problem, we take initial steps towards fair knowledge distillation for GNNs. Specifically, we first formulate a novel problem of fair knowledge distillation for GNN-based teacher-student frameworks. Then we propose a principled framework named RELIANT to mitigate the bias exhibited by the student model. Notably, the design of RELIANT is decoupled from any specific teacher and student model structures, and thus can be easily adapted to various GNN-based KD frameworks. We perform extensive experiments on multiple real-world datasets, which corroborate that RELIANT achieves less biased GNN knowledge distillation while maintaining high prediction utility.
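As background for the teacher-student setup described here, the following is a minimal sketch of a standard soft-target distillation loss: the KL divergence between temperature-softened teacher and student predictions. It shows generic knowledge distillation only and contains no fairness term, so it is not RELIANT; the temperature value and function names are assumptions for illustration.

```python
import numpy as np


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax over the last axis."""
    scaled = np.asarray(logits, dtype=float) / temperature
    scaled = scaled - scaled.max(axis=-1, keepdims=True)
    return scaled - np.log(np.exp(scaled).sum(axis=-1, keepdims=True))


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Mean KL(teacher || student) on temperature-softened distributions."""
    log_p_teacher = log_softmax(teacher_logits, temperature)
    log_p_student = log_softmax(student_logits, temperature)
    kl_per_node = np.sum(
        np.exp(log_p_teacher) * (log_p_teacher - log_p_student), axis=-1
    )
    # The temperature**2 factor keeps gradient magnitudes comparable across temperatures.
    return float(kl_per_node.mean() * temperature ** 2)


teacher = np.array([[4.0, 1.0, 0.0], [0.5, 2.5, 0.0]])
student = np.array([[2.0, 1.0, 0.5], [0.0, 1.0, 0.5]])
loss = distillation_loss(student, teacher)
```

The loss is zero when the student reproduces the teacher's softened distribution exactly, which also means any bias in the teacher's predictions is a target the student is pushed towards.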